Generalizing from Case Studies: A Case Study
Abstract
Most empirical evaluations of machine learning algorithms are case studies: evaluations of multiple algorithms on multiple databases. Authors of case studies implicitly or explicitly hypothesize that the pattern of their results, which often suggests that one algorithm performs significantly better than others, is not limited to the small number of databases investigated, but instead holds for some general class of learning problems. However, these hypotheses are rarely supported with additional evidence, which leaves them suspect. This paper describes an empirical method for generalizing results from case studies and an example application. This method yields rules describing when some algorithms significantly outperform others on some dependent measures. Advantages of generalizing from case studies and limitations of this particular approach are also described.
A central objective in machine learning research is to determine the conditions describing when one heuristic learning algorithm outperforms others for a given set of dependent variables (e.g., predictive accuracy, speed, storage). Although formal mathematical analyses are preferred to detail these conditions in the form of average expected computational behavior (Pazzani & Sarrett, 1990), such results are difficult to produce since the algorithms and/or databases are usually complex. Instead, empirical evaluations are conducted to yield case study results: measures of some dependent variable(s) obtained from applying a set of algorithms to one or more carefully selected databases (e.g., …). Invariably, some algorithms are reported to significantly outperform others in the case study. Although authors usually hypothesize why these performance differences occurred, their explanations are infrequently evaluated and may be inaccurate. More systematic methods are required to accurately generalize case study results. Few methods for generalizing case studies have been reported in the machine learning literature.
However, the approach introduced in this paper has much in common with Rendell and Cho's (1990) investigations. They used artificially generated databases to examine how the performances of two similar algorithms were affected by several data characteristics, particularly concept size (i.e., the percentage of positive instances) and concept concentration (i.e., the number of prototypes defining the target concept). This paper instead focuses on a general method that characterizes the situations when arbitrarily different learning algorithms have a constant significant performance difference. More specifically, this paper details a simple empirical method that generalizes case studies. It is independent of the set of dependent and independent variables being investigated, the selected learning task, and the selected learning algorithms. The objective of this generalization method is to …
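As a rough illustration of the two data characteristics named above, the following Python sketch generates an artificial database in which concept size and the number of prototypes can be set independently. It is not the generator used by Rendell and Cho or by this paper: the one-dimensional attribute, the evenly spaced positive intervals standing in for prototypes, and the names `make_database`, `measured_concept_size`, and `count_positive_regions` are all assumptions made for illustration.

```python
import random


def make_database(n_instances, concept_size, n_prototypes, seed=0):
    """Return (x, label) pairs with x drawn uniformly from [0, 1).

    Assumption: each prototype is modelled as one positive interval.
    The unit interval is cut into n_prototypes equal cells, and the
    leading fraction `concept_size` of every cell is labelled positive.
    """
    rng = random.Random(seed)
    cell = 1.0 / n_prototypes
    data = []
    for _ in range(n_instances):
        x = rng.random()
        label = (x % cell) < concept_size * cell
        data.append((x, label))
    return data


def measured_concept_size(data):
    """Fraction of instances that are positive."""
    return sum(1 for _, label in data if label) / len(data)


def count_positive_regions(data):
    """Number of maximal runs of positive instances when sorted by x."""
    regions = 0
    previous = False
    for _, label in sorted(data):
        if label and not previous:
            regions += 1
        previous = label
    return regions


if __name__ == "__main__":
    db = make_database(10000, concept_size=0.25, n_prototypes=4, seed=1)
    print(measured_concept_size(db), count_positive_regions(db))
```

Varying `concept_size` and `n_prototypes` over a grid and running the algorithms under comparison on each generated database is one way such characteristics could be related to performance differences.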
Similar resources
Comparison of Internalizing Disorders in 8-14-Year-Old Offsprings of Opium and Heroin Dependent Parents: A Case-Control Study
Background: In general, parental substance abuse is associated with children's emotional and behavioral problems. This study only investigated the internalizing problems (depression, anxiety and physical complaints) in children of opioid or heroin-dependent parents in comparison with non-opioid dependent parents in order to determine the effects of drug dependency after exclu...
The ‘Critical Case’ in Information Systems Research
Information systems research has taken many different directions and a host of different approaches have been used. Different researchers within the multidisciplinary field of IS seem to prefer some approaches to others based on both epistemological and more practical grounds. In this paper we look at the case study approach in IS research and focus on the importance of selecting ‘critical case...
A 3D elasto-plastic FEM program developed for reservoir Geomechanics simulations: Introduction and case studies
The development of yielded or failure zone due to an engineering construction is a subject of study in different disciplines. In Petroleum engineering, depletion from and injection of gas into a porous rock can cause development of a yield zone around the reservoir. Studying this phenomenon requires elasto-plastic analysis of geomaterial, in this case the porous rocks. In this study, which is a...
Investigating the trend of nested case-control studies and comparing it with cohort and case-control studies from 1996 to 2005
Background & Objectives: This investigation was prompted by the growing importance of nested case-control studies and the increasing frequency with which they are done in epidemiologic research. After a brief explanation of nested case-control studies, we evaluate the trends in research methodology over the last decade, especially with regard to cohort, case-control, and nested case-control des...
Estimating river suspended sediment yield using MLP neural network in arid and semi-arid basins. Case study: Bar River, Neyshaboor, Iran
Erosion and sedimentation are the most complicated problems in hydrodynamics and are very important in water-related projects of arid and semi-arid basins. For this reason, suitable methods for good estimation of the suspended sediment load of rivers are very valuable. Solving hydrodynamic equations related to these phenomena and access to a mathematical-conceptual mode...
Publication date: 1992